Decision Boundary Partitioning: Variable Resolution Model-Free Reinforcement Learning

Author

  • Stuart I. Reynolds
Abstract

Reinforcement learning agents attempt to learn and construct a decision policy which maximises some reward signal. In turn, this policy is directly derived from long-term value estimates of state-action pairs. In environments with real-valued state-spaces, however, it is impossible to enumerate the value of every state-action pair, necessitating the use of a function approximator in order to infer state-action values from similar states. Typically, function approximators require many parameters for which suitable values may be difficult to determine a priori. Traditional systems of this kind are also then bound to the fixed limits imposed by the initial parameters, beyond which no further improvements are possible. This paper introduces a new method to adaptively increase the resolution of a discretised action-value function based upon which regions of the state-space are most important for the purposes of choosing an action. The method is motivated by similar work by Moore and Atkeson but improves upon the existing techniques insofar as it: i) is applicable to a wider class of learning tasks, ii) does not require transition or reward models to be constructed and so can also be used with a variety of model-free reinforcement learning algorithms, iii) continues to improve upon policies even after a feasible solution to the learning problem has been found.
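The core idea of a variable-resolution discretised action-value function can be sketched as follows. This is an illustrative sketch, not the paper's exact algorithm: the class and method names (`Leaf`, `VariableResolutionQ`, `split`) are hypothetical, and the split trigger shown (manually halving a chosen cell) stands in for the paper's decision-boundary criterion, under which cells would be refined where neighbouring regions disagree on the greedy action.

```python
class Leaf:
    """A hyper-rectangular cell of the state space holding one Q-value per action."""
    def __init__(self, low, high, n_actions):
        self.low, self.high = low, high      # per-dimension bounds of the cell
        self.q = [0.0] * n_actions           # action-value estimates for this cell
        self.visits = 0

    def contains(self, s):
        return all(l <= x < h for x, l, h in zip(s, self.low, self.high))

class VariableResolutionQ:
    """Variable-resolution tabular Q-function: the state space is a set of
    axis-aligned cells, and individual cells can be split in half so that
    resolution grows only where finer distinctions matter for action choice."""
    def __init__(self, low, high, n_actions, alpha=0.1, gamma=0.95):
        self.leaves = [Leaf(list(low), list(high), n_actions)]
        self.n_actions, self.alpha, self.gamma = n_actions, alpha, gamma

    def leaf(self, s):
        return next(l for l in self.leaves if l.contains(s))

    def greedy(self, s):
        q = self.leaf(s).q
        return q.index(max(q))

    def update(self, s, a, r, s2):
        """Model-free one-step Q-learning backup on the cell containing s."""
        l = self.leaf(s)
        target = r + self.gamma * max(self.leaf(s2).q)
        l.q[a] += self.alpha * (target - l.q[a])
        l.visits += 1

    def split(self, leaf, dim):
        """Halve a cell along one dimension; both children inherit the
        parent's Q-values so learned estimates are not discarded."""
        mid = (leaf.low[dim] + leaf.high[dim]) / 2.0
        left = Leaf(list(leaf.low), list(leaf.high), self.n_actions)
        right = Leaf(list(leaf.low), list(leaf.high), self.n_actions)
        left.high[dim], right.low[dim] = mid, mid
        left.q, right.q = list(leaf.q), list(leaf.q)
        self.leaves.remove(leaf)
        self.leaves += [left, right]
```

Because updates and action selection only ever touch the cell containing the current state, the representation stays model-free: no transition or reward model is needed, only sampled `(s, a, r, s')` experience.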


Similar articles

Elastic Resource Management with Adaptive State Space Partitioning of Markov Decision Processes

Modern large-scale computing deployments consist of complex applications running over machine clusters. An important issue in these is the offering of elasticity, i.e., the dynamic allocation of resources to applications to meet fluctuating workload demands. Threshold based approaches are typically employed, yet they are difficult to configure and optimize. Approaches based on reinforcement lea...

Full text

Variable Resolution Hierarchical RL

The contribution of this paper is to introduce heuristics, that go beyond safe state abstraction in hierarchical reinforcement learning, to approximate a decomposed value function. Additional improvements in time and space complexity for learning and execution may outweigh achieving less than hierarchically optimal performance and deliver anytime decision making during execution. Heuristics are...

Full text

Multiagent Reinforcement Learning in Stochastic Games with Continuous Action Spaces

We investigate the learning problem in stochastic games with continuous action spaces. We focus on repeated normal form games, and discuss issues in modelling mixed strategies and adapting learning algorithms in finite-action games to the continuous-action domain. We applied variable resolution techniques to two simple multi-agent reinforcement learning algorithms PHC and MinimaxQ. Preliminary ...

Full text

A Dynamic Tree Structure for Incremental Reinforcement Learning of Good Behavior

This paper addresses the idea of learning by reinforcement, within the theory of behaviorism. The reason for this choice is its generality and especially that the reinforcement learning paradigm allows systems to be designed, which can improve their behavior beyond that of their teacher. The role of the teacher is to define the reinforcement function, which acts as a description of the problem t...

Full text


Journal:

Volume   Issue

Pages  -

Publication date: 1999